Understanding Sampling-based Adversarial Search Methods
Author: Raghuram Ramanujan
Abstract
Until 2007, the best computer programs for playing the board game Go performed at the level of a weak amateur, while employing the same Minimax algorithm that had proven so successful in other games such as Chess and Checkers. Thanks to a revolutionary new sampling-based planning approach named Upper Confidence bounds applied to Trees (UCT), today's best Go programs play at a master level on full-sized 19 × 19 boards. Intriguingly, UCT's spectacular success in Go has not been replicated in domains that have been the traditional stronghold of Minimax-style approaches. The focus of this thesis is on understanding this phenomenon. We begin with a thorough examination of the various facets of UCT in the games of Chess and Mancala, where we can contrast the behavior of UCT with that of the better understood Minimax approach. We then introduce the notion of shallow search traps (positions in games where short winning strategies for the opposing player exist) and demonstrate that these are distributed very differently in different games, and that this has a significant impact on the performance of UCT. Finally, we study UCT and Minimax in two novel synthetic game settings that permit mathematical analysis. We show that UCT is relatively robust to misleading heuristic feedback if the noise samples are independently drawn, whereas systematic biases in a heuristic can cause UCT to prematurely "freeze" onto sub-optimal lines of play and thus perform poorly. We conclude with a discussion of the potential avenues for future work.

Raghuram Ramanujan is a PhD candidate in Computer Science at Cornell University, where he has collaborated with Bart Selman and Ashish Sabharwal on research problems related to algorithms for computer game playing. Prior to Cornell, he was an undergraduate at Purdue University, where he earned a B.S. in Computer Engineering, with a minor in Economics. His interest in Artificial Intelligence was stoked by his undergraduate research work in machine learning and planning, which was carried out under the supervision of Robert Givan and Alan Fern. In the distant past, he spent a couple of years in Singapore completing his GCE 'A' Levels as an SIA Youth Scholar. Outside of academia, he is an accomplished nature and wildlife photographer whose work has been featured on the website of the Cornell Lab of Ornithology and in a field guide to birding in the Finger Lakes region.

ACKNOWLEDGEMENTS This thesis would …
Similar resources
Understanding Sampling Style Adversarial Search Methods
UCT has recently emerged as an exciting new adversarial reasoning technique based on cleverly balancing exploration and exploitation in a Monte-Carlo sampling setting. It has been particularly successful in the game of Go but the reasons for its success are not well understood and attempts to replicate its success in other domains such as Chess have failed. We provide an in-depth analysis of th...
On the Behavior of UCT in Synthetic Search Spaces
UCT and Minimax are two of the most prominent tree-search based adversarial reasoning strategies for a variety of challenging domains, such as Chess and Go. Their complementary strengths in different domains have been the motivation for several works attempting to achieve a better understanding of their vastly different behavior. Rather than using complex games as a testbed for deriving indirec...
On Adversarial Search Spaces and Sampling-Based Planning
Upper Confidence bounds applied to Trees (UCT), a bandit-based Monte-Carlo sampling algorithm for planning, has recently been the subject of great interest in adversarial reasoning. UCT has been shown to outperform traditional minimax-based approaches in several challenging domains such as Go and Kriegspiel, although minimax search still prevails in other domains such as Chess. This work provide...
Computation and Decision-Making in Large Extensive Form Games
In this thesis, we investigate the problem of decision-making in large two-player zero-sum games using Monte Carlo sampling and regret minimization methods. We demonstrate four major contributions. The first is Monte Carlo Counterfactual Regret Minimization (MC-CFR): a generic family of sample-based algorithms that compute near-optimal equilibrium strategies. Secondly, we develop a...
Sparse Sampling for Adversarial Games
This paper introduces Monte Carlo *-Minimax Search (MCMS), a Monte-Carlo search algorithm for finite, turn-based, stochastic, two-player, zero-sum games of perfect information. Through a combination of sparse sampling and classical pruning techniques, MCMS allows deep plans to be constructed. Unlike other popular tree search techniques, MCMS is suitable for densely stochastic games, i.e., gam...
Adversarial Texts with Gradient Methods
Adversarial samples for images have been extensively studied in the literature. Among the many attack methods, gradient-based methods are both effective and easy to compute. In this work, we propose a framework to adapt gradient-based attack methods for images to the text domain. The main difficulties for generating adversarial texts with gradient methods are: (i) the input space is discrete,...
Publication date: 2012